Exploring Attention Based Model for Captioning Images
Understanding the content of images is arguably the primary goal of computer
vision. Beyond merely saying what is in an image, one test of a system's
understanding of an image is its ability to describe the contents of an image
in natural language (a task we will refer to in this thesis as "image
captioning"). Since 2012, neural networks have emerged as the de facto
modelling tool for many important applications in machine learning. Inspired
by recent work in machine translation and object detection, this thesis
explores such models that can describe the content of images. In addition, it
explores how the notion of "attention" can be both parameterized by neural
networks and usefully employed for image captioning.
More technically, this thesis presents a single attention-based neural network
that can describe images. It describes how to train such models in a purely
deterministic manner using standard backpropagation, and stochastically using
techniques from variational inference and reinforcement learning.
Surprisingly, we show through visualization how the model automatically learns
an intuitive gaze over the salient objects corresponding to words in the
output sequence. We validate the attention-based approach with
state-of-the-art performance on three benchmark datasets: Flickr8k, Flickr30k
and MS COCO.
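The deterministic ("soft") variant of attention reduces to a weighted average: the model scores each image-region annotation vector, softmaxes the scores into weights, and sums the regions. A minimal sketch in plain Python (the region vectors and scores below are made up for illustration; in the thesis the scores come from a learned network conditioned on the decoder state):

```python
import math

def soft_attention(annotations, scores):
    """Soft attention: softmax the relevance scores into weights that sum
    to one, then take the weighted average of the annotation vectors."""
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(annotations[0])
    context = [sum(a * vec[d] for a, vec in zip(alphas, annotations))
               for d in range(dim)]
    return alphas, context

# Three toy image-region vectors; the first region scores highest.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, context = soft_attention(regions, [2.0, 0.5, 0.5])
```

In the stochastic ("hard") variant, a single region would instead be sampled from the alpha distribution, which is why training then calls for variational-inference or reinforcement-learning techniques.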
A Physiological Role for Amyloid Beta Protein: Enhancement of Learning and Memory
Amyloid beta protein (Aβ) is well recognized as having a significant role in the pathogenesis of Alzheimer's disease (AD). The reason for the presence of Aβ and its physiological role in non-disease states are not clear. In these studies, low doses of Aβ enhanced memory retention in two memory tasks and enhanced acetylcholine production in the hippocampus in vivo. We then tested whether endogenous Aβ has a role in learning and memory in young, cognitively intact mice by blocking endogenous Aβ in healthy 2-month-old CD-1 mice. Blocking Aβ with antibody to Aβ or with DFFVG (which blocks Aβ binding), or decreasing Aβ expression with an antisense directed at the Aβ precursor APP, all resulted in impaired learning in T-maze foot-shock avoidance. Finally, Aβ1-42 facilitated induction and maintenance of long-term potentiation (LTP) in hippocampal slices, whereas antibodies to Aβ inhibited hippocampal LTP. These results indicate that in normal healthy young animals the presence of Aβ is important for learning and memory.
Learning Tasks for Multitask Learning: Heterogenous Patient Populations in the ICU
Machine learning approaches have been effective in predicting adverse
outcomes in different clinical settings. These models are often developed and
evaluated on datasets with heterogeneous patient populations. However, good
predictive performance on the aggregate population does not imply good
performance for specific groups.
In this work, we present a two-step framework to 1) learn relevant patient
subgroups, and 2) predict an outcome for separate patient populations in a
multi-task framework, where each population is a separate task. We demonstrate
how to discover relevant groups in an unsupervised way with a
sequence-to-sequence autoencoder. We show that using these groups in a
multi-task framework leads to better predictive performance of in-hospital
mortality both across groups and overall. We also highlight the need for more
granular evaluation of performance when dealing with heterogeneous populations.
Comment: KDD 201
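The need for granular evaluation can be illustrated with a small helper (all names and numbers here are illustrative; in the paper the subgroups come from clustering sequence-to-sequence autoencoder embeddings):

```python
from collections import defaultdict

def per_group_accuracy(y_true, y_pred, groups):
    """Accuracy overall and per subgroup: a good aggregate number can
    hide poor performance on a specific patient subgroup."""
    hits, counts = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        counts[g] += 1
        hits[g] += int(t == p)
    overall = sum(hits.values()) / sum(counts.values())
    return overall, {g: hits[g] / counts[g] for g in counts}

# Toy labels: the aggregate looks fine, but subgroup "b" does worse.
overall, by_group = per_group_accuracy(
    [1, 0, 1, 1], [1, 0, 0, 1], ["a", "a", "b", "b"])
```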
Relational Collaborative Filtering: Modeling Multiple Item Relations for Recommendation
Existing item-based collaborative filtering (ICF) methods leverage only the
relation of collaborative similarity. Nevertheless, there exist multiple
relations between items in real-world scenarios. Distinct from the
collaborative similarity that implies co-interact patterns from the user
perspective, these relations reveal fine-grained knowledge on items from
different perspectives of meta-data, functionality, etc. However, how to
incorporate multiple item relations is less explored in recommendation
research. In this work, we propose Relational Collaborative Filtering (RCF), a
general framework to exploit multiple relations between items in recommender
systems. We find that both the relation type and the relation value are crucial
in inferring user preference. To this end, we develop a two-level hierarchical
attention mechanism to model user preference. The first-level attention
discriminates which types of relations are more important, and the second-level
attention considers the specific relation values to estimate the contribution
of a historical item in recommending the target item. To make the item
embeddings reflective of the relational structure between items, we further
formulate a task to preserve the item relations, and jointly train it with the
recommendation task of preference modeling. Empirical results on two real
datasets demonstrate the strong performance of RCF. Furthermore, we also
conduct qualitative analyses to show the benefits of explanations brought by
the modeling of multiple item relations.
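The two-level hierarchical attention can be sketched as a nested softmax: the effective weight of a historical item interacted with under relation type t and value v is the product of the type weight and the within-type value weight. This is a simplified sketch with hand-picked scores; in RCF both levels are learned from user, item, and relation embeddings:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def two_level_attention(type_scores, value_scores_by_type):
    """First level: weight relation types. Second level: weight the specific
    relation values within each type. The effective weight of an item under
    (type t, value v) is alpha[t] * beta[t][v], and all weights sum to one."""
    alpha = softmax(type_scores)
    betas = [softmax(vs) for vs in value_scores_by_type]
    return [[a * b for b in bs] for a, bs in zip(alpha, betas)]

# Two relation types (e.g. shared genre vs. shared director), with two and
# three relation values respectively; scores are hand-picked for illustration.
weights = two_level_attention([1.0, 0.0], [[2.0, 1.0], [0.5, 0.5, 0.5]])
```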
Context-Aware Embeddings for Automatic Art Analysis
Automatic art analysis aims to classify and retrieve artistic representations
from a collection of images by using computer vision and machine learning
techniques. In this work, we propose to enhance visual representations from
neural networks with contextual artistic information. Whereas visual
representations are able to capture information about the content and the style
of an artwork, our proposed context-aware embeddings additionally encode
relationships between different artistic attributes, such as author, school, or
historical period. We design two different approaches for using context in
automatic art analysis. In the first one, contextual data is obtained through a
multi-task learning model, in which several attributes are trained together to
find visual relationships between elements. In the second approach, context is
obtained through an art-specific knowledge graph, which encodes relationships
between artistic attributes. An exhaustive evaluation of both of our models on
several art analysis problems, such as author identification, type
classification, and cross-modal retrieval, shows that performance improves by
up to 7.3% in art classification and 37.24% in retrieval when context-aware
embeddings are used.
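One simple way to realize a context-aware embedding is to concatenate the visual feature vector with a context vector (from the multi-task model or the knowledge graph) and renormalize. This sketch is illustrative only and not necessarily the exact fusion used in the paper:

```python
import math

def context_aware_embedding(visual, context):
    """Concatenate a visual feature vector with a contextual one (e.g.
    encoding author, school, or period) and L2-normalize the result so both
    parts contribute comparably to retrieval distances."""
    joint = list(visual) + list(context)
    norm = math.sqrt(sum(x * x for x in joint)) or 1.0
    return [x / norm for x in joint]

# Toy 2-d visual vector plus a 1-d context feature.
embedding = context_aware_embedding([3.0, 4.0], [0.0])
```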